Daniel (Danny) Tarlow
Wednesday 20th November 2013
Time: 4pm
Basement Seminar Room
Alexandra House, 17 Queen Square, London, WC1N 3AR
Learning to Generate Natural Source Code
Natural source code is source code that is written by, and meant to be understood by, humans. I'll talk about recent efforts to build generative models that (a) capture the structure present in source code, and (b) can be learned efficiently from large repositories of existing code. Our approach builds upon the work of Mnih & Teh (2012) on fast training of neural probabilistic language models, but incorporates hierarchical structure and additional source code-specific structure. Empirically, our new models substantially outperform existing language models in terms of log probability of held-out data, and samples from the learned models look more like real source code.